    The identification of informative genes from multiple datasets with increasing complexity

Background: In microarray data analysis, factors such as data quality, biological variation, and the increasingly multi-layered nature of complex biological systems complicate the modelling of regulatory networks that can represent and capture the interactions among genes. We believe that the use of multiple datasets derived from related biological systems leads to more robust models. We therefore developed a novel framework for modelling regulatory networks that involves training and evaluation on independent datasets. Our approach comprises the following steps: (1) ordering the datasets based on their level of noise and informativeness; (2) selecting a Bayesian classifier with an appropriate level of complexity by evaluating predictive performance on independent datasets; (3) comparing the different gene selections and the influence of increasing model complexity; (4) functional analysis of the informative genes. Results: In this paper, we identify the most appropriate model complexity using cross-validation and independent test set validation for predicting gene expression in three published datasets related to myogenesis and muscle differentiation. Furthermore, we demonstrate that models trained on simpler datasets can be used to identify interactions among genes and to select the most informative ones. We also show that these models explain the myogenesis-related genes (genes of interest) significantly better than others (P < 0.004), since the improvement in their rankings is much more pronounced. Finally, after further evaluating our results on synthetic datasets, we show that our approach outperforms the concordance method of Lai et al. in identifying informative genes from multiple datasets with increasing complexity, while additionally modelling the interactions between genes. Conclusions: We show that Bayesian networks derived from simpler controlled systems perform better than those trained on datasets from more complex biological systems. Further, we show that genes that are highly predictive and consistent across independent datasets, drawn from the pool of differentially expressed genes, are more likely to be fundamentally involved in the biological process under study. We conclude that networks trained on simpler controlled systems, such as in vitro experiments, can be used to model and capture interactions among genes in more complex datasets, such as in vivo experiments, where these interactions would otherwise be concealed by a multitude of other ongoing events.
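A minimal sketch of step (2), assuming scikit-learn and toy data in place of the study's expression datasets: a simpler and a more complex classifier are compared by cross-validation on one dataset and then scored on an independent one. GaussianNB and RandomForestClassifier are stand-ins for the Bayesian network models of increasing complexity used in the paper; all names and data below are illustrative, not the authors' code.

```python
# Hedged sketch: choose model complexity by internal cross-validation,
# then confirm on an independent dataset (toy random data throughout).
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

models = {
    "simpler model (naive Bayes)": GaussianNB(),
    "more complex model (random forest)": RandomForestClassifier(
        n_estimators=200, random_state=0
    ),
}

rng = np.random.default_rng(0)
# Hypothetical expression matrices (samples x genes) and class labels.
X_train, y_train = rng.normal(size=(60, 500)), rng.integers(0, 2, 60)
X_indep, y_indep = rng.normal(size=(40, 500)), rng.integers(0, 2, 40)

for name, model in models.items():
    cv_acc = cross_val_score(model, X_train, y_train, cv=5).mean()
    indep_acc = model.fit(X_train, y_train).score(X_indep, y_indep)
    print(f"{name}: CV accuracy {cv_acc:.2f}, independent-set accuracy {indep_acc:.2f}")
```

The point of the two scores is the comparison: a model whose cross-validation accuracy is not matched on the independent set is too complex for the data at hand.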

    Machine learning algorithms for systematic review: reducing workload in a preclinical review of animal studies and reducing human screening error

BACKGROUND: Here, we outline a method of applying existing machine learning (ML) approaches to aid citation screening in an ongoing broad and shallow systematic review of preclinical animal studies. The aim is to achieve a high-performing algorithm, comparable to human screening, that can reduce the human resources required for this step of a systematic review. METHODS: We applied ML approaches to a broad systematic review of animal models of depression at the citation screening stage. We tested two independently developed ML approaches, which used different classification models and feature sets. We recorded the performance of the ML approaches on an unseen validation set of papers using sensitivity, specificity, and accuracy. We aimed to achieve 95% sensitivity and to maximise specificity. The classification model providing the most accurate predictions was applied to the remaining unseen records in the dataset and will be used in the next stage of the preclinical biomedical sciences systematic review. We used a cross-validation technique to assign ML inclusion likelihood scores to the human-screened records, to identify potential errors made during the human screening process (error analysis). RESULTS: ML approaches reached 98.7% sensitivity based on learning from a training set of 5749 records, with an inclusion prevalence of 13.2%. The highest level of specificity reached was 86%. Performance was assessed on an independent validation dataset. Human errors in the training and validation sets were successfully identified by using the assigned inclusion likelihood from the ML model to highlight discrepancies. Training the ML algorithm on the corrected dataset improved the specificity of the algorithm without compromising sensitivity. Error-analysis correction led to a 3% improvement in sensitivity and specificity, increasing the precision and accuracy of the ML algorithm. CONCLUSIONS: This work has confirmed the performance and applicability of ML algorithms for screening in systematic reviews of preclinical animal studies. It has highlighted the novel use of ML algorithms to identify human error. This needs to be confirmed in other reviews with different inclusion prevalence levels, but it represents a promising approach to integrating human decisions and automation in systematic review methodology.
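To illustrate the error-analysis step, the sketch below (not the review team's actual pipeline; all data and names are hypothetical) assigns each human-screened record an out-of-fold ML inclusion likelihood via cross-validation and flags records where the human decision and the model disagree strongly.

```python
# Hedged sketch: out-of-fold inclusion likelihoods for human-screened
# records, used to surface possible screening errors (toy corpus).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

texts = [
    "rat model of depression, forced swim test",
    "narrative review of clinical antidepressant trials",
    "mouse chronic mild stress paradigm, sucrose preference",
    "editorial on depression care in humans",
] * 10  # repeated toy corpus so cross-validation has enough records
human_included = [1, 0, 1, 0] * 10  # human screening decisions

# Out-of-fold predicted probability of inclusion for every record.
probs = cross_val_predict(
    LogisticRegression(max_iter=1000),
    TfidfVectorizer().fit_transform(texts),
    human_included,
    cv=5,
    method="predict_proba",
)[:, 1]

# Flag candidate human errors: excluded records the model scores high,
# and included records it scores low.
for i, (p, label) in enumerate(zip(probs, human_included)):
    if (label == 0 and p > 0.9) or (label == 1 and p < 0.1):
        print(f"record {i}: human={label}, ML likelihood={p:.2f} -> re-check")
```

Flagged records would then be re-screened by a human, as in the error analysis described above; the 0.9/0.1 thresholds are arbitrary choices for the sketch.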

    Dissecting complex transcriptional responses using pathway-level scores based on prior information

Background: The genomewide pattern of changes in mRNA expression measured using DNA microarrays is typically a complex superposition of the response of multiple regulatory pathways to changes in the environment of the cells. The use of prior information, either about the function of the protein encoded by each gene, or about the physical interactions between regulatory factors and the sequences controlling its expression, has emerged as a powerful approach for dissecting complex transcriptional responses. Results: We review two different approaches for combining the noisy expression levels of multiple individual genes into robust pathway-level differential expression scores. The first is based on a comparison between the distribution of expression levels of genes within a predefined gene set and those of all other genes in the genome. The second starts from an estimate of the strength of genomewide regulatory network connectivities, based on sequence information or direct measurements of protein-DNA interactions, and uses regression analysis to estimate the activity of gene regulatory pathways. The statistical methods used are explained in detail. Conclusion: By avoiding the thresholding of individual genes, pathway-level analysis of differential expression based on prior information can be considerably more sensitive to subtle changes in gene expression than gene-level analysis. The methods are technically straightforward and yield results that are easily interpretable, both biologically and statistically.
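The two scoring strategies can be sketched in a few lines. The example below uses synthetic data and assumes NumPy/SciPy; it illustrates the general ideas rather than any specific published implementation.

```python
# Hedged sketch of both pathway-scoring approaches on synthetic data.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
n_genes, n_regulators = 2000, 5
logfc = rng.normal(size=n_genes)       # hypothetical per-gene log fold changes
in_set = np.zeros(n_genes, dtype=bool)  # membership in a predefined gene set
in_set[:50] = True
logfc[in_set] += 0.5                    # toy signal: the set is mildly up-regulated

# Approach 1: compare the expression distribution of set genes with that
# of all other genes in the genome (no per-gene thresholding).
stat, p = mannwhitneyu(logfc[in_set], logfc[~in_set])
print(f"gene-set score: U={stat:.0f}, p={p:.2e}")

# Approach 2: regress expression changes on connectivity strengths; the
# fitted coefficients estimate the activity of each regulatory pathway.
# C is a hypothetical genes x regulators connectivity matrix, as would be
# derived from sequence motifs or protein-DNA binding measurements.
C = rng.normal(size=(n_genes, n_regulators))
activities, *_ = np.linalg.lstsq(C, logfc, rcond=None)
print("inferred pathway activities:", np.round(activities, 2))
```

Because both scores pool information across many genes, a subtle but coherent shift in a pathway can reach significance even when no single gene would pass a per-gene cutoff, which is the sensitivity advantage the review emphasises.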

    Field's Logic of Truth

Saving Truth from Paradox is a re-exciting development. The 70s and 80s were a time of excitement among people working on the semantic paradoxes. There were continual formal developments, with the constant hope that these results would yield deep insights. The enthusiasm wore off, however, as people became more cognizant of the disparity between what they had accomplished, impressive as it was, and what they had hoped to accomplish. They moved on to other problems that they hoped would prove more yielding. That, at least, was how it seemed to me, so I was delighted to see a dramatically new formal development that is likely to rekindle our enthusiasm.

    Empowerment or Engagement? Digital Health Technologies for Mental Healthcare

We argue that while digital health technologies (e.g. artificial intelligence, smartphones, and virtual reality) present significant opportunities for improving the delivery of healthcare, key concepts that are used to evaluate and understand their impact can obscure significant ethical issues related to patient engagement and experience. Specifically, we focus on the concept of empowerment and ask whether it is adequate for addressing some significant ethical concerns that relate to digital health technologies for mental healthcare. We frame these concerns using five key ethical principles for AI ethics (i.e. autonomy, beneficence, non-maleficence, justice, and explicability), which have their roots in the bioethical literature, in order to critically evaluate the role that digital health technologies will have in the future of digital healthcare.

    Surveillance in ubiquitous network societies: Normative conflicts related to the consumer in-store supermarket experience in the context of the Internet of Things

The Internet of Things (IoT) is an emerging global infrastructure that employs wireless sensors to collect, store, and exchange data. Increasingly, applications for marketing and advertising have been articulated as a means to enhance the consumer shopping experience, in addition to improving efficiency. However, privacy advocates have challenged the mass aggregation of personally identifiable information in databases and geotracking, the use of location-based services to identify one’s precise location over time. This paper employs the framework of contextual integrity related to privacy developed by Nissenbaum (Privacy in context: technology, policy, and the integrity of social life. Stanford University Press, Stanford, 2010) as a tool to understand citizen response to the implementation of IoT-related technology in the supermarket. The purpose of the study was to identify and understand specific changes in information practices brought about by the IoT that may be perceived as privacy violations. Citizens were interviewed, read a scenario of near-term IoT implementation, and were asked to reflect on changes in the key actors involved, information attributes, and principles of transmission. Areas where new practices may occur with the IoT were then highlighted as potential problems (privacy violations). Issues identified included the mining of medical data, invasive targeted advertising, and loss of autonomy through marketing profiles or personal affect monitoring. While there were numerous aspects deemed desirable by the participants, some developments appeared to tip the balance between consumer benefit and corporate gain. This surveillance power creates an imbalance between the consumer and the corporation that may also impact individual autonomy. The ethical dimensions of this problem are discussed.

    FISH as an effective diagnostic tool for the management of challenging melanocytic lesions

Background: The accuracy of melanoma diagnosis continues to challenge the pathology community, even today with sophisticated histopathologic techniques. Melanocytic lesions exhibit significant morphological heterogeneity. While the majority of biopsies can be classified as benign (nevus) or malignant (melanoma) using well-established histopathologic criteria, there exists a cohort for which the prediction of clinical behaviour and invasive or metastatic potential is difficult if not impossible to ascertain on the basis of morphological features alone. Multiple studies have shown that there is significant disagreement between pathologists, and even expert dermatopathologists, in the diagnosis of this subgroup of difficult melanocytic lesions. Methods: A four-probe FISH assay was used to analyse a cohort of 500 samples, comprising 157 nevus, 176 dysplastic nevus, and 167 melanoma specimens. Results: Review of the lesions determined that the assay identified genetic abnormalities in 83.8% of melanomas and 1.9% of nevi without atypia, while genetic abnormalities were identified in 6.3%, 6.7%, and 10.3% of nevi with mild, moderate, and severe atypia, respectively. Conclusions: Based on this study, heritable genetic damage/instability identified by FISH testing is a hallmark of a progressive malignant process and a valuable diagnostic tool for the identification of high-risk lesions.
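For orientation, the reported proportions translate directly into the usual diagnostic measures. The short worked example below makes the arithmetic explicit, under the assumption (not stated in the abstract itself) that melanomas count as positives and non-atypical nevi as negatives.

```python
# Worked example (not from the paper): deriving diagnostic measures
# from the reported proportions.
melanoma_detected = 0.838  # fraction of melanomas with FISH abnormalities
nevus_flagged = 0.019      # fraction of non-atypical nevi flagged

sensitivity = melanoma_detected    # true-positive rate
specificity = 1.0 - nevus_flagged  # true-negative rate
print(f"sensitivity ~ {sensitivity:.1%}, specificity ~ {specificity:.1%}")
```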